Operational Resilience

Operational resilience is an organization's ability to keep delivering its critical operations through disruption. It is a fundamental concept within Risk Management that covers preventing, responding to, recovering from, and learning from operational disruptions. Although it grew out of Operational Risk management, operational resilience emphasizes the continuity of essential services, even in the face of severe but plausible events. The goal is to keep critical functions running in order to protect consumers, preserve market integrity, and maintain Financial Stability.

History and Origin

The concept of operational resilience has evolved significantly, particularly in the financial sector, driven by a series of global disruptions, including natural disasters, cyberattacks, and the COVID-19 pandemic. Regulators worldwide recognized that while firms had robust Crisis Management and disaster recovery plans, these often focused on recovering individual systems rather than ensuring the uninterrupted delivery of critical services to customers and markets.

This shift in focus led to the development of specific regulatory frameworks. For example, the Basel Committee on Banking Supervision (BCBS), a global standard-setter for prudential regulation, issued its "Principles for operational resilience" in March 2021. These principles aim to strengthen banks' ability to withstand operational risk-related events that could cause significant operational failures or wide-scale disruptions in financial markets.[4] Similarly, in the United Kingdom, the Financial Conduct Authority (FCA) published its final rules on "Building operational resilience" (PS21/3) in March 2021, setting out new requirements for financial services firms to identify important business services and set "impact tolerances" for disruptions.[3] In the United States, the Federal Reserve Board, alongside other agencies, issued "Sound Practices to Strengthen Operational Resilience" in October 2020, providing guidance for large and complex firms.[2] The increasing focus from global regulators underscores the critical importance of operational resilience in safeguarding the financial system against unforeseen shocks.[1]

Key Takeaways

  • Operational resilience focuses on ensuring the continuous delivery of critical business services, even amidst severe disruptions.
  • It goes beyond traditional Disaster Recovery by emphasizing the impact on external customers and market integrity.
  • Regulatory bodies globally have introduced frameworks and guidance to enhance operational resilience within the financial sector.
  • Key components include identifying important business services, setting Impact Tolerance levels, mapping interdependencies, and robust Scenario Testing.
  • Effective Governance is crucial for overseeing and implementing an organization's operational resilience approach.

Interpreting Operational Resilience

Interpreting operational resilience involves assessing an organization's capacity to continue delivering its most important services under adverse conditions. The question is not simply whether a system can be restored, but whether the outcome of that system (the service it provides) can be maintained or quickly resumed at an acceptable level. Organizations define "important business services" (or similar terms) that, if disrupted, could cause intolerable harm to consumers, threaten market integrity, or impact their financial viability. For each of these services, an Impact Tolerance is set, which is the maximum tolerable duration or scope of disruption. Setting and meeting a tolerance requires understanding how the people, processes, technology, facilities, and information behind each service depend on one another. Regular Scenario Testing is then conducted to confirm the organization's ability to remain within these defined tolerances.

Hypothetical Example

Consider a large online brokerage firm that offers a critical service: real-time stock trading. The firm identifies this as an important business service. Through its operational resilience framework, it determines an Impact Tolerance of "no more than 30 minutes of complete outage" for its trading platform, as extended downtime could lead to significant financial losses for customers and erode market confidence.

To achieve this, the firm maps all underlying dependencies, including its trading software, data centers, network infrastructure, personnel responsible for system monitoring, and key third-party vendors providing cloud services and market data feeds, which it oversees through Third-Party Risk Management.

During a routine Scenario Testing exercise, the firm simulates a denial-of-service attack on its primary data center. While the attack initially causes system slowdowns, the firm's robust operational resilience measures, including redundant systems and automated failover protocols, ensure that trading services are rerouted to a secondary data center within 15 minutes. This action allows the firm to remain within its 30-minute impact tolerance, demonstrating effective operational resilience. After the exercise, a review is conducted to identify any lessons learned and further strengthen the resilience of the trading platform.
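
The check performed in this exercise can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the class names, field names, and helper functions are hypothetical, and the sketch assumes the 15-minute failover can be treated as the full duration of the service outage. Only the 30-minute tolerance and the 15-minute figure come from the example above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class ImportantBusinessService:
    """Hypothetical model of an important business service."""
    name: str
    impact_tolerance: timedelta  # maximum tolerable duration of a complete outage


@dataclass(frozen=True)
class ScenarioResult:
    """Hypothetical record of one scenario test."""
    scenario: str
    service_outage: timedelta  # how long the service itself was unavailable


def within_tolerance(service: ImportantBusinessService, result: ScenarioResult) -> bool:
    """Return True if the observed outage stayed within the impact tolerance."""
    return result.service_outage <= service.impact_tolerance


def remaining_headroom(service: ImportantBusinessService, result: ScenarioResult) -> timedelta:
    """Time left before the tolerance would have been breached (negative if breached)."""
    return service.impact_tolerance - result.service_outage


# Figures from the brokerage example; the outage duration is an assumption.
trading = ImportantBusinessService("real-time stock trading", timedelta(minutes=30))
dos_test = ScenarioResult("denial-of-service on primary data center", timedelta(minutes=15))

print(within_tolerance(trading, dos_test))    # True
print(remaining_headroom(trading, dos_test))  # 0:15:00
```

Tracking the headroom alongside the pass/fail result gives the post-exercise review something concrete to work with: a test that passes with only a minute to spare points to a weaker position than one that passes with fifteen.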

Practical Applications

Operational resilience is a critical concern across various sectors, particularly within financial services, due to the interconnected nature of markets and the potential for Systemic Risk. Its practical applications include:

  • Financial Institutions: Banks, investment firms, and exchanges implement operational resilience frameworks to protect essential services like payment processing, trading platforms, and customer account access from disruptions caused by IT failures, Cybersecurity Risk incidents, or other events. This helps maintain market integrity and consumer trust.
  • Regulatory Frameworks: Regulators worldwide mandate operational resilience programs to ensure firms can absorb shocks and continue functioning. This includes requirements for identifying critical services, setting impact tolerances, mapping resources, and conducting rigorous Scenario Testing.
  • Supply Chain Management: As organizations increasingly rely on third-party providers for critical services, operational resilience extends to managing the resilience of these external dependencies. This involves robust due diligence and contractual agreements to ensure vendor resilience.
  • Critical Infrastructure: While often associated with finance, operational resilience principles apply to any sector reliant on continuous service delivery, such as utilities, healthcare, and telecommunications, to ensure essential services are maintained for the public.
  • Enterprise Risk Management: Operational resilience integrates with broader Enterprise Risk Management strategies by providing a focused lens on the continuity of critical business functions, complementing financial risk and credit risk assessments.

Limitations and Criticisms

While essential, operational resilience frameworks have certain limitations and face criticisms. One challenge is the inherent difficulty in anticipating every conceivable severe but plausible disruption. Organizations might focus on known threats, potentially overlooking emerging or "black swan" events. Defining and measuring "impact tolerance" can also be subjective, and setting appropriate thresholds requires careful judgment, potentially leading to underestimation of actual disruption impacts.

The extensive mapping of interdependencies, especially those involving complex supply chains and numerous third-party relationships, can be resource-intensive and challenging to maintain. Firms must consistently invest in technologies and processes to support their operational resilience efforts, and the effectiveness of these investments can be hard to quantify directly. Some critiques suggest that while regulatory focus on operational resilience is necessary, it can lead to a compliance-driven approach rather than a genuine shift in an organization's underlying culture of preparedness and Contingency Planning. Because threats keep evolving, particularly in the area of Cybersecurity Risk, operational resilience is not a static state but an ongoing process that requires continuous adaptation and review.

Operational Resilience vs. Business Continuity Planning

While often used interchangeably or seen as closely related, operational resilience and Business Continuity Planning (BCP) have distinct focuses.

Feature | Operational Resilience | Business Continuity Planning (BCP)
Primary Focus | Delivering critical outcomes/services through disruption. | Resuming operations or restoring systems after disruption.
Scope | Broader; considers end-to-end service delivery, including third parties and external dependencies. | Typically narrower; focuses on internal processes, IT systems, and recovery time objectives (RTOs).
Goal | Minimize impact of disruption on customers, markets, and firm viability. | Minimize downtime and restore business functions as quickly as possible.
Perspective | Outcome-driven; from the perspective of the service recipient. | Process/resource-driven; from the perspective of the organization's functions.
Key Metric | Impact Tolerance (maximum acceptable disruption duration/scope). | Recovery Time Objective (RTO) and Recovery Point Objective (RPO).

Operational resilience builds upon BCP by elevating the focus from simply recovering internal operations to ensuring that the vital services delivered to external stakeholders remain available within predefined impact tolerances. BCP is a critical component and tool within a comprehensive operational resilience framework.
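
A small worked example can make the difference between the two key metrics concrete. The sketch below is illustrative only: the component names, RTO values, recovery times, and the assumption that components must be recovered one after another are all hypothetical. It shows how every component can meet its own RTO while the service as a whole still breaches its impact tolerance.

```python
from datetime import timedelta

# Hypothetical components behind one service: (name, RTO, actual recovery time).
components = [
    ("network", timedelta(minutes=15), timedelta(minutes=12)),
    ("database", timedelta(minutes=20), timedelta(minutes=18)),
    ("trading application", timedelta(minutes=15), timedelta(minutes=10)),
]

# Hypothetical impact tolerance for the service these components support.
impact_tolerance = timedelta(minutes=30)

# BCP view: did each component recover within its own RTO?
all_rtos_met = all(actual <= rto for _, rto, actual in components)

# Resilience view: how long was the service unavailable to customers?
# Assumes sequential recovery, so the individual recovery times add up.
service_outage = sum((actual for _, _, actual in components), timedelta())

within_impact_tolerance = service_outage <= impact_tolerance

print(all_rtos_met)             # True
print(service_outage)           # 0:40:00
print(within_impact_tolerance)  # False
```

Under these assumptions, a component-by-component review would report success, while the customer experienced a 40-minute outage against a 30-minute tolerance. If the components could be recovered in parallel, the outage would instead be governed by the slowest one.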

FAQs

What is the main difference between operational resilience and traditional disaster recovery?

Traditional Disaster Recovery focuses on restoring IT systems and infrastructure after an outage. Operational resilience, on the other hand, centers on the continued availability of critical business services, regardless of the underlying cause or specific system failure. It's about maintaining the outcome for the customer or market, rather than just restoring a specific component.

Why is operational resilience particularly important in the financial sector?

The financial sector is highly interconnected, meaning a disruption at one firm can quickly cascade and affect others, potentially leading to Systemic Risk and wider financial instability. Operational resilience helps safeguard this interconnectedness and ensures essential services like payments, trading, and lending continue, protecting consumers and markets.

What is an "impact tolerance" in operational resilience?

An Impact Tolerance is the maximum acceptable level of disruption to an important business service. It defines how long or how severely a service can be disrupted before the impact becomes intolerable for consumers, the market, or the firm itself. Setting these tolerances is a core part of an operational resilience framework.

How do organizations measure operational resilience?

Organizations measure operational resilience by identifying their important business services, setting an Impact Tolerance for each service, and then regularly conducting Scenario Testing to verify that they can remain within those tolerances during severe but plausible disruptions. The measurement focuses on the continuity of the service rather than just the uptime of individual systems.

What role does third-party risk management play in operational resilience?

In today's interconnected financial ecosystem, many critical services rely on external providers. Third-Party Risk Management is crucial for operational resilience because organizations must ensure that their key vendors also have robust resilience capabilities. A disruption at a third-party provider can directly impact an organization's ability to deliver its own critical services.
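
As a rough illustration of how a dependency map supports this analysis, the Python sketch below records which resources each service relies on and then asks which services would be exposed if a given provider failed. The service names, resource names, and function names are hypothetical, and a real mapping would cover far more detail (people, facilities, data, and sub-contractors).

```python
from collections import defaultdict

# Hypothetical dependency map: service -> resources it relies on.
DEPENDENCIES = {
    "real-time stock trading": [
        "trading software", "primary data center", "cloud provider", "market data feed",
    ],
    "customer account access": ["web portal", "cloud provider"],
    "payment processing": ["payments gateway", "primary data center"],
}

# Hypothetical set of resources supplied by external vendors.
THIRD_PARTIES = {"cloud provider", "market data feed"}


def services_by_resource(dependencies):
    """Invert the map: resource -> set of services that depend on it."""
    index = defaultdict(set)
    for service, resources in dependencies.items():
        for resource in resources:
            index[resource].add(service)
    return index


def services_exposed_to(provider, dependencies):
    """List the services that would be affected if the provider failed."""
    return sorted(services_by_resource(dependencies).get(provider, set()))


def third_party_concentration(dependencies, third_parties):
    """Count how many services depend on each third party."""
    index = services_by_resource(dependencies)
    return {tp: len(index.get(tp, ())) for tp in sorted(third_parties)}


print(services_exposed_to("cloud provider", DEPENDENCIES))
# ['customer account access', 'real-time stock trading']
print(third_party_concentration(DEPENDENCIES, THIRD_PARTIES))
# {'cloud provider': 2, 'market data feed': 1}
```

A provider that appears behind several important business services is a natural candidate for closer due diligence and for inclusion in scenario tests.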
